
455 research outputs found

They are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers

Author: Chang X
Xing EP
Yang Y
Yu YL
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/12/2016

© 2016 IEEE. Complex event detection on unconstrained Internet videos has seen much progress in recent years. However, state-of-the-art performance degrades dramatically when the number of positive training exemplars falls short. Since label acquisition is costly, laborious, and time-consuming, there is a real need to consider the much more challenging semantic event search problem, where no example video is given. In this paper, we present a state-of-the-art event search system without any example videos. Relying on the key observation that events (e.g., 'dog show') are usually compositions of multiple mid-level concepts (e.g., 'dog,' 'theater,' and 'dog jumping'), we first train a skip-gram model to measure the relevance of each concept to the event of interest. The relevant concept classifiers then cast votes on the test videos, but their reliability, due to the lack of labeled training videos, has been largely unaddressed. We propose to combine the concept classifiers based on a principled estimate of their accuracy on the unlabeled test videos. A novel warping technique is proposed to improve the performance, and an efficient, highly scalable algorithm is provided to quickly solve the resulting optimization. We conduct extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets, and achieve state-of-the-art performance.
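The relevance-then-vote idea in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity between embedding vectors as the relevance measure and a plain relevance-weighted sum as the vote; both are illustrative choices, and the paper's reliability estimate and warping technique are not reproduced here. The toy vectors stand in for skip-gram embeddings, and the names `relevance_weights` and `score_video` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def relevance_weights(event_vec, concept_vecs, top_k):
    """Keep the top_k concepts most similar to the event and normalise their weights to sum to 1."""
    sims = {name: max(cosine(event_vec, vec), 0.0) for name, vec in concept_vecs.items()}
    top = sorted(sims, key=sims.get, reverse=True)[:top_k]
    total = sum(sims[name] for name in top)
    return {name: sims[name] / total for name in top} if total else {}

def score_video(classifier_scores, weights):
    """Relevance-weighted vote: sum of weight * classifier score over the selected concepts."""
    return sum(w * classifier_scores.get(name, 0.0) for name, w in weights.items())

# Toy 2-D vectors (assumed values, not real skip-gram embeddings).
event = [1.0, 0.0]
concepts = {"dog": [1.0, 0.0], "theater": [1.0, 1.0], "car": [0.0, 1.0]}
weights = relevance_weights(event, concepts, top_k=2)
video_score = score_video({"dog": 0.9, "theater": 0.5, "car": 0.1}, weights)
```

Weighting every selected classifier by relevance alone treats them as equally trustworthy, which is the limitation the abstract sets out to address.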

Crossref

OPUS - University of Technology Sydney

Complex event detection using semantic saliency and nearly-isotonic SVM

Author: Chang X
Xing EP
Yang Y
Yu YL
Publication date: 01/01/2015

Copyright © 2015 by the author(s). We aim to detect complex events in long Internet videos that may last for hours. A major challenge in this setting is that only a few shots in a long video are relevant to the event of interest while others are irrelevant or even misleading. Instead of indiscriminately pooling the shots, we first define a novel notion of semantic saliency that assesses the relevance of each shot to the event of interest. We then prioritize the shots according to their saliency scores, since shots that are semantically more salient are expected to contribute more to the final event detector. Next, we propose a new isotonic regularizer that is able to exploit the semantic ordering information. The resulting nearly-isotonic SVM classifier exhibits higher discriminative power. Computationally, we develop an efficient implementation using the proximal gradient algorithm, and we prove new, closed-form proximal steps. We conduct extensive experiments on three real-world video datasets and confirm the effectiveness of the proposed approach.

CiteSeerX

OPUS - University of Technology Sydney

Network Analysis of Breast Cancer Progression and Reversal Using a Tree-Evolving Network Algorithm

Author: Becker-Weimann S
Bissell M
Curtis RE
Kuhn I
Parikh AP
Wu W
Xing EP
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

The HMT3522 progression series of human breast cells has been used to discover how tissue architecture, microenvironment and signaling molecules affect breast cell growth and behaviors. However, much remains to be elucidated about malignant and phenotypic reversion behaviors of the HMT3522-T4-2 cells of this series. We employed a "pan-cell-state" strategy and jointly analyzed microarray profiles obtained from different state-specific cell populations from this progression and reversion model of the breast cells using a tree-lineage multi-network inference algorithm, Treegl. We found that different breast cell states contain distinct gene networks. The network specific to non-malignant HMT3522-S1 cells is dominated by genes involved in normal processes, whereas the T4-2-specific network is enriched with cancer-related genes. The networks specific to various conditions of the reverted T4-2 cells are enriched with pathways suggestive of compensatory effects, consistent with clinical data showing patient resistance to anticancer drugs. We validated the findings using an external dataset, and showed that aberrant expression values of certain hubs in the identified networks are associated with poor clinical outcomes. Thus, analysis of various reversion conditions (including non-reverted) of HMT3522 cells using Treegl can be a good model system to study drug effects on breast cancer. © 2014 Parikh et al.

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

DiCE: The Infinitely Differentiable Monte-Carlo Estimator

Author: Al-Shedivat M
Farquhar G
Foerster J
Rocktäschel T
Whiteson S
Xing EP
Publication venue: Proceedings of Machine Learning Research
Publication date: 01/07/2018

The score function estimator is widely used for estimating gradients of stochastic objectives in stochastic computation graphs (SCGs), e.g., in reinforcement learning and meta-learning. While deriving the first-order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher-order derivatives is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct new objectives for each order derivative involves increasingly cumbersome graph manipulations. Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives. To address all these shortcomings in a unified way, we introduce DiCE, which provides a single objective that can be differentiated repeatedly, generating correct estimators of derivatives of any order in SCGs. Unlike SL, DiCE relies on automatic differentiation for performing the requisite graph manipulations. We verify the correctness of DiCE both through a proof and numerical evaluation of the DiCE derivative estimates. We also use DiCE to propose and evaluate a novel approach for multi-agent learning. Our code is available at https://www.github.com/alshedivat/lola
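For readers unfamiliar with the score function estimator this abstract builds on, the following is a minimal first-order sketch only; it does not implement DiCE or any higher-order estimator. It assumes a single Bernoulli(theta) variable and an arbitrary cost `f`, chosen purely for illustration, and compares the Monte Carlo estimate with the exact derivative, which for this toy case is f(1) - f(0).

```python
import random

def score_function_gradient(theta, f, n_samples, seed=0):
    """Monte Carlo estimate of d/dtheta E_{x ~ Bernoulli(theta)}[f(x)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = 1 if rng.random() < theta else 0
        # Score: d/dtheta log p(x; theta) is 1/theta for x = 1 and -1/(1 - theta) for x = 0.
        score = 1.0 / theta if x == 1 else -1.0 / (1.0 - theta)
        total += f(x) * score
    return total / n_samples

def exact_gradient(f):
    """E[f(x)] = theta * f(1) + (1 - theta) * f(0), so the derivative is f(1) - f(0)."""
    return f(1) - f(0)

def cost(x):
    # Hypothetical cost used only for this illustration.
    return 3.0 if x == 1 else 1.0

estimate = score_function_gradient(theta=0.3, f=cost, n_samples=200_000)
reference = exact_gradient(cost)
```

The estimate is unbiased but noisy, so it only approaches the reference value as the number of samples grows.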

UCL Discovery

Temporal Model Adaptation for Person Re-Identification

Author: AJ Joshi
B Settles
D Tao
EP Xing
G Chechik
G Lisanti
H Xia
J Chen
J García
KQ Weinberger
M Hirzer
M Pavan
N Martinel
Peter M. Roth
R Johnson
R Vezzani
R Zhang
S Boyd
WS Zheng
Xiaochun Cao
Z Wang
Z Wu
ZC Guo
Publication date: 25/07/2016

Person re-identification is an open and challenging problem in computer vision. Most effort has been spent either on designing the best feature representation or on learning the optimal matching metric. Most approaches have neglected the problem of adapting the selected features or the learned model over time. To address this problem, we propose a temporal model adaptation scheme with a human in the loop. We first introduce a similarity-dissimilarity learning method which can be trained incrementally by means of a stochastic alternating direction method of multipliers (ADMM) optimization procedure. Then, to achieve temporal adaptation with limited human effort, we exploit a graph-based approach to present the user with only the most informative probe-gallery matches that should be used to update the model. Results on three datasets show that our approach performs on par with or even better than state-of-the-art approaches while reducing the manual pairwise labeling effort by about 80%.

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine

Serum diagnosis of diffuse large B-cell lymphomas and further identification of response to therapy using SELDI-TOF-MS and tree analysis patterning

Author: A Vlahou
BL Adam
Bo Wang
C P.
CP Paweletz
DA Holterman
DE van der Merwe
EF Petricoin
EF Petricoin 3rd
EP Diamandis
G Wu
GL Wright Jr.
H Hong
H Zhang
HJ Issaq
IS Lossos
J Marshall
JD Wulfkuhle
JM Sorace
JS Abramson
KA Baggerly
L Breiman
L Miguet
LL Banez
MA Shipp
OJ Semmes
PC Walsh
RB Wilder
Wen-qi Jiang
Xiao-shi Zhang
Xing Zhang
Z Lin
Zhi-ming Li
Zhong-zhen Guan
Publication venue: BioMed Central
Publication date: 01/01/2007

Crossref

Springer - Publisher Connector

PubMed Central

CSMET: Comparative Genomic Motif Detection via Multi-Resolution Phylogenetic Shadowing

Author: A Sandelin
A Siepel
AC Siepel
AM Moses
BE Engelhardt
C Bergman
C Boutilier
CM Bergman
D Boffelli
DA Papatsenko
EH Margulies
EP Xing
Eric P. Xing
GE Crooks
GJ Olsen
I Dubchak
J Felsenstein
J Pedersen
JD McAuliffe
M Blanchette
M Hasegawa
M Tompa
MC Frith
Mladen Kolar
MR Kantorovitz
MZ Ludwig
Pradipta Ray
PV Benos
R Siddharthan
RG Cowell
S Sinha
SB Montgomery
Suyash Shringarpure
T Wang
TH Jukes
Uwe Ohler
W Huang
Publication venue: Public Library of Science
Publication date: 01/06/2008

Functional turnover of transcription factor binding sites (TFBSs), such as whole-motif loss or gain, is a common event during genome evolution. Conventional probabilistic phylogenetic shadowing methods model the evolution of genomes only at the nucleotide level, and lack the ability to capture the evolutionary dynamics of functional turnover of aligned sequence entities. As a result, comparative genomic search of non-conserved motifs across evolutionarily related taxa remains a difficult challenge, especially in higher eukaryotes, where the cis-regulatory regions containing motifs can be long and divergent; existing methods rely heavily on specialized pattern-driven heuristic search or sampling algorithms, which can be difficult to generalize and hard to interpret based on phylogenetic principles. We propose a new method: Conditional Shadowing via Multi-resolution Evolutionary Trees, or CSMET, which uses a context-dependent probabilistic graphical model that allows aligned sites from different taxa in a multiple alignment to be modeled by either a background or an appropriate motif phylogeny conditioning on the functional specifications of each taxon. The functional specifications themselves are the output of a phylogeny which models the evolution not of individual nucleotides, but of the overall functionality (e.g., functional retention or loss) of the aligned sequence segments over lineages. Combining this method with a hidden Markov model that autocorrelates evolutionary rates on successive sites in the genome, CSMET offers a principled way to take into consideration lineage-specific evolution of TFBSs during motif detection, and a readily computable analytical form of the posterior distribution of motifs under TFBS turnover. On both simulated and real Drosophila cis-regulatory modules, CSMET outperforms other state-of-the-art comparative genomic motif finders.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Trade-offs in Large-Scale Distributed Tuplewise Estimation and Learning

Author: A Lee
A Van Der Vaart
EP Xing
G Blom
J Dean
M Jordan
P Bertail
P Carbone
R Bekkerman
S Bubeck
S Clémençon
SP Boyd
V de la Pena
V Smith
W Hoeffding
Publication date: 21/06/2019

The development of cluster computing frameworks has allowed practitioners to scale out various statistical estimation and machine learning algorithms with minimal programming effort. This is especially true for machine learning problems whose objective function is nicely separable across individual data points, such as classification and regression. In contrast, statistical learning tasks involving pairs (or more generally tuples) of data points, such as metric learning, clustering or ranking, do not lend themselves as easily to data-parallelism and in-memory computing. In this paper, we investigate how to balance statistical performance and computational efficiency in such distributed tuplewise statistical problems. We first propose a simple strategy based on occasionally repartitioning data across workers between parallel computation stages, where the number of repartitioning steps governs the trade-off between accuracy and runtime. We then present some theoretical results highlighting the benefits brought by the proposed method in terms of variance reduction, and extend our results to design distributed stochastic gradient descent algorithms for tuplewise empirical risk minimization. Our results are supported by numerical experiments in pairwise statistical estimation and learning on synthetic and real-world datasets. Comment: 23 pages, 6 figures, ECML 201
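The repartitioning strategy described in this abstract can be sketched for the simplest tuplewise case, a pairwise U-statistic. The sketch below is a single-process simulation under stated assumptions: workers are modelled as list blocks, each round reshuffles the data uniformly at random, and the final estimate is a plain average over rounds. The function names and the variance kernel are illustrative and are not taken from the paper.

```python
import random
from itertools import combinations

def pairwise_u_statistic(points, kernel):
    """Average of kernel(x, y) over all unordered pairs in `points` (needs at least 2 points)."""
    pairs = list(combinations(points, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)

def partition(points, n_workers, rng):
    """Shuffle the data and deal it into n_workers blocks of nearly equal size."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_workers] for i in range(n_workers)]

def repartitioned_estimate(points, kernel, n_workers, n_rounds, seed=0):
    """Average within-block U-statistics over several random repartitions.

    Each round only sees pairs that land on the same simulated worker;
    more rounds expose more distinct pairs at the cost of more data movement.
    """
    rng = random.Random(seed)
    round_estimates = []
    for _ in range(n_rounds):
        blocks = partition(points, n_workers, rng)
        local = [pairwise_u_statistic(block, kernel) for block in blocks]
        round_estimates.append(sum(local) / len(local))
    return sum(round_estimates) / len(round_estimates)

def variance_kernel(x, y):
    # With this kernel the full U-statistic equals the unbiased sample variance.
    return 0.5 * (x - y) ** 2

data = [float(i) for i in range(20)]
full = pairwise_u_statistic(data, variance_kernel)
one_round = repartitioned_estimate(data, variance_kernel, n_workers=4, n_rounds=1)
many_rounds = repartitioned_estimate(data, variance_kernel, n_workers=4, n_rounds=50)
```

Here `n_rounds` plays the role of the number of repartitioning steps: a single round is cheapest but ignores every cross-worker pair, while additional rounds trade runtime for a less variable estimate.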

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset

Author: A Chadt
A Colorni
A Gamez-Pozo
A Rasche
A Tiss
AC Sauve
AL Oberg
Alexandra Chadt
Ali Tiss
B Wu
C Bauer
C Mercier
C Yang
Celia J Smith
Chris Bauer
D Kwon
D Mantini
DB West
Dieter Beule
E Lange
EP Xing
Frank Kleinjung
G Ge
GK Smyth
H Ressom
Hadi Al-Hasani
HS Jurgens
HS Jürgens
I Guyon
J Hua
J McGuire
J Norris
J Voortman
JE Shaw
JF Timms
JL Rodgers
Johannes Schuchhardt
Johnson RAaBGK
JR Ortlepp
K Coombes
Knut Reinert
L Breiman
M Dorigo
M Kirchner
M Palmblad
M Sturm
Mark W Towers
ME de Noo
MJ Crawley
MP van der Werff
N Tiffin
O Kohlbacher
P Du
P Pratapa
P Zhang
PV Rao
Q Liu
R Aebersold
R Cramer
Rainer Cramer
RC Gentleman
Robert Gentleman and Vince Carey and Wolfgang Huber and Rafael Irizarry and Sandrine Dudoit (Ed)
SM Carlson
T Alexandrov
T Dreja
T Hastie
Tanja Dreja
W Yu
X Liu
X Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental design are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.

Central Archive at the University of Reading

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Constructing Biological Pathways by a Two-Step Counting Approach

Author: A Agresti
AP Dempster
B Liu
BM Buehrer
CJ Roberts
D Sahoo
DC Weaver
DJ Allocco
EP van Someren
EP Xing
F Ay
F Markowetz
F Posas
FV Jensen
G Zhu
H Kim
H Wang
HD Madhani
Henry Horng-Shing Lu
Hsiuying Wang
I Shmulevich
J Hubble
J Pearl
LM Li
MI Davidich
Miguel A. Blazquez
P D'haeseleer
P Li
PJ Bickel
PT Spellman
S Liang
SA Kauffman
SM O'Rourke
ST Jensen
T Akutsu
T Chen
Tung-Hung Chueh
Z Wei
Publication venue: Public Library of Science
Publication date: 01/01/2010

Networks are widely used in biology to represent the relationships between genes and gene functions. In Boolean biological models, it is mainly assumed that there are two states to represent a gene: on-state and off-state. It is typically assumed that the relationship between two genes can be characterized by two kinds of pairwise relationships: similarity and prerequisite. Many approaches have been proposed in the literature to reconstruct biological relationships. In this article, we propose a two-step method to reconstruct the biological pathway when the binary array data have measurement error. For a pair of genes in a sample, the first step of this approach is to assign counting numbers for every relationship and select the relationship with a counting number greater than a threshold. The second step is to calculate the asymptotic p-values for hypotheses of possible relationships and select relationships with a large p-value. This new method has the advantages of easy calculation for the counting numbers and simple closed forms for the p-value. The simulation study and real data example show that the two-step counting method can accurately reconstruct the biological pathway and outperforms the existing methods. Compared with the other existing methods, this two-step method can provide a more accurate and efficient alternative approach for reconstructing the biological network.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central